1. INTRODUCTION 

Requirements engineers are one of the key profiles within software development teams. They 
balance the expectations of all project stakeholders from the idea phase to the post-production phase. Requirements 
engineering is not solely a technical discipline. It also has an interdisciplinary nature that involves aspects of 
cognitive psychology, anthropology, sociology, linguistics and philosophy [1]. Thus, requirements engineers 
have a dramatic impact on the success of software products, and their continuous competence 
development is critical. 

Functional size measurement (FSM) is an important task that is used for scoping, budgeting, 
managing outsourcing contracts, effort estimation, etc. This task is generally the responsibility of 
system analysts or requirements engineers (REs). COSMIC Function Point (CFP) is one of the recent FSM 
methods. Function Point variants are mainly used for software cost estimation [2] and for measuring the 
productivity of development organisations [3]. Function Point is also a good indicator of the business 
complexity of software [4]. Additionally, its CFP variant is a strong tool for requirement quality and 
process improvement [5]. CFP measurement errors made by requirements engineers lead to budget, schedule 
and quality problems in software projects. 
Therefore, it is crucial to foresee and plan requirements engineers’ CFP training needs quickly and 
correctly. A recent paper points out that CFP training should be better represented in higher 
education [6]. We think training is also critical in the workplace setting, and REs’ CFP competence should 
be developed continually as the need arises. Training is the dominant factor in improving the quality 
of FSM [7]. Factors that cause inconsistent and inaccurate CFP measurements might be mitigated by training 
[8]. In this study, requirements engineers’ CFP training need is forecast by applying machine learning 
algorithms to the artifacts they produce in the workplace. 

Data mining or software analytics studies that use requirements engineering artifacts are scarce 
[9]. For example, one of these rare studies aims at early test effort prediction using UML diagrams [10]. On 
the other hand, software data mining studies based on software code and code change artifacts are 
common [9]. For instance, code smells in source code have been investigated using neural network 
models in a recent study [11]. We observe that using CFP data and data mining for educational purposes is 
even rarer in the literature. As far as we know, this research is the first study in the literature that uses 
CFP data and educational data mining to improve REs’ CFP measurement capabilities. The rest of the paper 
is organized as follows: Section 2 provides background on data mining, machine learning algorithms and 
CFP, together with the study details. Results are presented in Section 3, followed by conclusions in 
Section 4. 


2. RESEARCH METHOD 

In this section, machine learning and CFP methods are first explained briefly in 
subsections 2.1 and 2.2. Second, the CFP training need prediction use case, the feature set design, and the 
data gathering and preparation phases of the study are presented in 2.3, 2.4 and 2.5. Finally, model 
training and evaluation details are given in 2.6. 


2.1. Data Mining and Machine Learning Algorithms 

Data Mining is defined as “the process of discovering patterns, automatically or semi-automatically, 
in large quantities of data” [12]. Knowledge discovery from data (KDD) is another common term used in the 
literature [13]. The following algorithms, as implemented in Weka [14], are used in this study: 

• Random Forest (RF): This is an ensemble learning method consisting of a set of decision tree 
classifiers. Each tree in the forest depends on an independently sampled random vector [15]. 

• Naive Bayes (NB): This method uses Bayes’ rule to do the classification by computing class 
probabilities from the observed attribute values. The method is called “naive” since it makes two basic 
assumptions: attributes are conditionally independent and no hidden factors affect the prediction 
process [16]. 

• REPTree: This is a fast decision tree algorithm that generates a decision tree using information gain 
as the splitting criterion [17]. Missing values are handled as in the C4.5 algorithm [18]. 

• J48: It is a Java implementation of a slightly different version of C4.5 [17]. 

• LMT: Logistic Model Trees are standard decision trees which use logistic regression functions at their 
leaves [19]. 

• Multilayer Perceptron (MLP): MLP is a feed-forward artificial neural network which uses the 
back-propagation training algorithm. It is a system of interconnected nodes or neurons which maps an input 
vector into an output vector to model a nonlinear relation [20]. The neurons are connected via 
weights and output signals [20]. 

• Support Vector Machines (SVM) and Sequential Minimal Optimization (SMO): In the linear case, an 
SVM is a hyperplane that sets a boundary between positive instances and negative instances [21]. 
It can also be extended to non-linear cases [21]. Training an SVM requires solving a quadratic 
programming (QP) optimization problem, which is very time and memory consuming. 
SMO is a substantial improvement on the original training algorithm [21]. 

• K-nearest Neighbour Classifier (IBk): It classifies a data point based on the k data points most 
similar to it [22]. 

• ZeroR: It predicts the majority class when the class is nominal and the average value when the class 
is numeric [12]. In this study, it is used as a baseline for the performance of the machine learning 
algorithms. 

• OneR: This method classifies instances based on a single rule which is extracted from a single 
attribute [23]. 
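
To make the two simplest learners in the list concrete, the following is a minimal Python sketch of ZeroR and OneR for nominal attributes only. It is an illustration rather than Weka's implementation: the function names (zero_r, one_r, predict_one_r) and the toy rows are hypothetical, and the handling of numeric attributes and missing values is omitted.

```python
from collections import Counter, defaultdict


def zero_r(labels):
    """Return the majority class of a list of nominal labels."""
    return Counter(labels).most_common(1)[0][0]


def one_r(rows, labels):
    """Pick the single attribute whose value -> majority-class rule
    makes the fewest errors on the training data.

    Returns (attribute_index, rule_dict, default_class).
    """
    default = zero_r(labels)
    best = None
    for attr in range(len(rows[0])):
        # Count class frequencies for every value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attr]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    _, attr, rule = best
    return attr, rule, default


def predict_one_r(model, row):
    """Apply the one-attribute rule; fall back to the majority class."""
    attr, rule, default = model
    return rule.get(row[attr], default)


# Hypothetical toy data: (Use case, Interaction) -> CFP Training Need
rows = [("Yes", "Yes"), ("Yes", "No"), ("No", "No"), ("Partial", "No"), ("Yes", "Yes")]
labels = ["No", "Yes", "Yes", "Yes", "No"]
model = one_r(rows, labels)
```

On this toy data the second attribute separates the classes without error, so OneR builds its rule on it, while ZeroR simply returns the majority class.
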

2.2. COSMIC Function Point (CFP) 

Software functional size measurement (FSM) has been in use for more than forty years [24]. There 
are many FSM methods [25]. COSMIC Function Point is a new-generation software functional size 
measurement method. The first version of the method was published in 1988 [26]. It is one of the four 
ISO-certified FSM methods that dominate the industry: IFPUG, COSMIC, NESMA and Mark II [25]. CFP 
measurement is a three-step process consisting of the measurement strategy, mapping and measurement steps. 
The purpose, scope and level of granularity of the measurement are determined in the first step. In the 
mapping step, the functional processes and data groups in the requirements are identified. In the final 
step, data movements are specified and counted for all functional processes [26]. CFP is the functional 
size measurement method used in the company under study [3,7,27]. 
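
The counting step can be illustrated with a short Python sketch. It assumes the usual COSMIC convention that each data movement (Entry, Exit, Read or Write) contributes one CFP. The functional processes and their movements below are hypothetical examples, not data from the study, and the function names are illustrative only.

```python
MOVEMENT_TYPES = {"Entry", "Exit", "Read", "Write"}


def functional_process_size(movements):
    """Size of one functional process in CFP: one CFP per data movement."""
    unknown = set(movements) - MOVEMENT_TYPES
    if unknown:
        raise ValueError(f"unknown data movement types: {sorted(unknown)}")
    return len(movements)


def total_size(processes):
    """Sum the sizes of all functional processes in the measurement scope."""
    return sum(functional_process_size(m) for m in processes.values())


# Hypothetical functional processes, for illustration only.
processes = {
    "Create customer order": ["Entry", "Read", "Write", "Exit"],
    "Query order status": ["Entry", "Read", "Exit"],
}
sizes = {name: functional_process_size(m) for name, m in processes.items()}
```

Here the two processes measure 4 CFP and 3 CFP, so the scope as a whole measures 7 CFP.
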


2.3. CFP Requirement Ontology 

The requirement artefacts used to train the machine learning models are instances of the 
requirement and CFP ontologies designed in [3]. Currently, this requirement ontology and CFP 
measurement are standard methods used by requirements engineers within the telecommunications 
company in which this study is conducted. Periodically, a subset of all requirements documents is randomly 
selected and examined manually by the internal audit team to identify errors in CFP measurements. After each 
audit, problematic CFP measurements are identified and recorded, and potential learning needs are reported to 
requirements engineering management. With this data mining research, the manual examination by the 
audit team is intended to be semi-automated, so that learning opportunities are extracted automatically from 
requirements documents. 


2.4. Feature Set Extraction from CFP Requirement Ontology 

CFP ontology concepts are shown in the second column of Table 1. Related concepts are 
grouped into concept categories to specify the data indicators used in the data mining process. 
Ontology concept categories are shown in the first column of Table 1. The resulting features of the data 
and the predicted outcome (class) of the classification process are shown in Table 2. In Table 2, the first 
seven attributes are input attributes and the last one, “CFP Training Need”, is the class, i.e. the 
classification result. 


Table 1. CFP Ontology Concept Categories 


Ontology Concept Category    Ontology Concept 
Use case                     Use case 
Use case                     Application Interaction Diagram 
Interaction                  Interaction 
Evolution Type               Add Evolution Type 
Evolution Type               Modify Evolution Type 
Evolution Type               Delete Evolution Type 
Application                  Application Business Module 
Application                  Application Database Module 
Application                  Application Service 
Application                  Application Service Boundary 
Use case                     Use case Actor 
Use case                     Use case Event 
Information                  Information Asset 
Interaction                  Integration Entry Interaction 
Interaction                  Integration Exit Interaction 
Interaction                  User Interface Entry Interaction 
Interaction                  User Interface Exit Interaction 
Interaction                  Database Write Interaction 
Interaction                  Database Read Interaction 
Scope                        Project Scope 
Scope                        Application Service Scope 
Not Applicable               Productivity Measurement 
Business Logic               Use case Business Logic 
Business Logic               Interaction Business Logic 


Table 2. CFP Ontology Indicator set and Their Possible Values 


Ontology Concept Category Value 

Use case Yes, No, Partial 
Interaction Yes, No, Partial 
Information Yes, No, Partial 
Evolution Type Yes, No, Partial 
Business Logic Yes, No, Partial 
Application Yes, No, Partial 
Scope Yes, No, Partial 

CFP Training Need Yes, No 


2.5. Data Gathering and Preparation 

The first seven data attributes shown in Table 2 are obtained from requirements documents by checking 
whether each concept category is present or not. If a concept category is only partially present, its 
value is recorded as “Partial”. The last attribute is taken from the audit examination results. If the 
difference between the measurement made by the requirements engineer and the correct measurement 
identified by the audit team is greater than 5%, the case is recorded as a learning opportunity by setting 
the “CFP Training Need” value to “Yes”. In total, 101 data points were collected and recorded in 
a comma-separated values (.csv) file. Next, this file was converted into the attribute-relation file format 
(.arff), which is the standard file format used by the Waikato Environment for Knowledge Analysis (WEKA) 
data mining software [14]. The .csv file was converted to .arff with an online web tool [28]. 
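
The labelling rule and the target file layout can be sketched in Python as follows. This is a hedged illustration, not the tooling used in the study (which relied on an online converter): the function names, the relation name and the example values are hypothetical, and the 5% difference is assumed to be measured relative to the audit team's value, which must be positive.

```python
ATTRIBUTES = ["Use case", "Interaction", "Information", "Evolution Type",
              "Business Logic", "Application", "Scope"]


def training_need(measured_cfp, audited_cfp, threshold=0.05):
    """Return "Yes" when the relative measurement error exceeds the threshold.

    Assumption: the error is taken relative to the audited (correct) size.
    """
    relative_error = abs(measured_cfp - audited_cfp) / audited_cfp
    return "Yes" if relative_error > threshold else "No"


def to_arff(rows, relation="cfp_training_need"):
    """Serialise rows (seven indicator values plus the class) as ARFF text."""
    lines = [f"@relation {relation}", ""]
    for name in ATTRIBUTES:
        # Attribute names containing spaces must be quoted in ARFF.
        lines.append(f"@attribute '{name}' {{Yes,No,Partial}}")
    lines.append("@attribute 'CFP Training Need' {Yes,No}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in rows]
    return "\n".join(lines) + "\n"


# Hypothetical record: seven indicators followed by the derived class label.
indicators = ["Yes", "No", "Partial", "Yes", "Yes", "No", "Yes"]
record = indicators + [training_need(measured_cfp=106, audited_cfp=100)]
arff_text = to_arff([record])
```

With these example values the 6% deviation exceeds the threshold, so the record is labelled “Yes”. A deviation of exactly 5% would be labelled “No” because the rule uses a strict inequality.
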


2.6. Model Training and Evaluation 

To train and evaluate the machine learning models, Weka Experimenter [29] was used. Two 
experiments were run in Weka. In the first experiment, the “Data sets first” parameter was selected and 
the number of repetitions was set to 100 in the iteration control panel. The experiment type was set to 
cross-validation with the number of folds set to 10. The dataset was the .arff file 
created as described in section 2.5. All algorithms explained in section 2.1 were 
selected in the Algorithms panel. The experiment with this configuration was then run on a machine with an 
Intel Core i7-5600U CPU at 2.6 GHz, 8 GB RAM and a 64-bit Windows operating system. 
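
The resampling scheme of the first experiment, 100 repetitions of 10-fold cross-validation over 101 instances, can be sketched in plain Python as follows. This is only an illustration under assumptions: the generator name, the seeding and the simple round-robin fold assignment are hypothetical, and any stratification or other fold construction details applied by Weka are not reproduced.

```python
import random


def repeated_kfold(n_instances, folds=10, repetitions=100, seed=1):
    """Yield (repetition, fold, train_indices, test_indices) tuples."""
    for rep in range(repetitions):
        indices = list(range(n_instances))
        # A different shuffle per repetition, reproducible through the seed.
        random.Random(seed + rep).shuffle(indices)
        for fold in range(folds):
            test = indices[fold::folds]  # every folds-th shuffled instance
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield rep, fold, train, test


splits = list(repeated_kfold(101))
```

Each algorithm is therefore trained and tested 1000 times, which is where the averages and standard deviations of the reported metrics come from.
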

The total execution time was 194 seconds and MLP had the slowest running time. Finally, in the 
Analyse tab of the Weka Experimenter user interface, each algorithm was selected in turn as the test base 
and the test was performed for each algorithm. The test was repeated for three evaluation metrics: Accuracy 
(share of correct classifications), F-Measure (FM) and the Kappa statistic. In the second experiment, the 
experiment type was set to Train/Test Percentage Split (data randomized) and the train percentage was set to 
66%. All other configuration settings remained the same as in Experiment 1. In this case, the total 
execution time was 74 seconds and MLP again had the slowest running time. 
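
The three metrics can be computed from a binary confusion matrix as in the following Python sketch. The helper names are hypothetical, and the 60/41 class split in the usage example is an assumed illustration rather than the study's reported class distribution. It shows why a majority-class predictor such as ZeroR obtains a Kappa of zero despite a non-trivial accuracy and F-Measure.

```python
def confusion(actual, predicted, positive):
    """Return (tp, fp, fn, tn) for the chosen positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def f_measure(tp, fp, fn, tn):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement beyond what is expected by chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Assumed example: 60 "Yes" and 41 "No" cases, predictor always says "Yes".
actual = ["Yes"] * 60 + ["No"] * 41
predicted = ["Yes"] * 101
cm = confusion(actual, predicted, positive="Yes")
scores = (accuracy(*cm), f_measure(*cm), kappa(*cm))
```

For this assumed split the accuracy is about 0.59 and the F-Measure about 0.75, while Kappa is 0 because the observed agreement equals the agreement expected by chance.
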


3. RESULTS AND ANALYSIS 

Table 3 shows the algorithm performances in terms of the accuracy, FM and Kappa metrics for 
both experiments. Metric values are shown as averages with standard deviations in parentheses. All 
algorithms appear meaningfully better than the ZeroR baseline. SVM with SMO and OneR have 
the largest average accuracy in Experiment 1 (84.16%), and REPTree and OneR in Experiment 2 (84.27%). 
However, evaluating the algorithms solely on the basis of average values 
and standard deviations would not be sufficient, since differences between results might not be statistically 
significant. Therefore, in the Weka Experimenter Analyse interface, the significance level was set to 0.05, 
each algorithm was selected in turn as the test base and the tests were performed. Statistically significant 
differences were recorded during the tests. Statistical significance is denoted by the “v” and “*” symbols 
in the Weka interface. The former indicates statistically significantly better performance, while the latter 
indicates statistically significantly worse performance [29]. 
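
The reason averages and standard deviations alone are not enough can be illustrated with a paired test on per-run score differences. The Python sketch below computes the statistic of the corrected resampled t-test (Nadeau and Bengio), a common choice for repeated cross-validation results. Which test variant was configured in the study is not stated here, so the choice of test, the function name and the example differences are all assumptions.

```python
import math
import statistics


def corrected_resampled_t(diffs, n_train, n_test):
    """t statistic of the corrected resampled t-test.

    diffs   : per-run differences in a metric between two algorithms
    n_train : number of training instances per run
    n_test  : number of test instances per run
    The n_test / n_train term inflates the variance to account for the
    overlap between training sets across runs.
    """
    k = len(diffs)
    mean = statistics.fmean(diffs)
    var = statistics.variance(diffs)  # sample variance, k - 1 denominator
    if var == 0:
        raise ValueError("differences have zero variance")
    return mean / math.sqrt((1 / k + n_test / n_train) * var)


# Hypothetical accuracy differences from four runs, 91 train / 10 test cases.
diffs = [0.1, 0.2, 0.3, 0.2]
t_corrected = corrected_resampled_t(diffs, n_train=91, n_test=10)
t_standard = statistics.fmean(diffs) / math.sqrt(statistics.variance(diffs) / len(diffs))
```

The corrected statistic is smaller than the standard paired t statistic for the same differences, so fewer differences are declared significant. The statistic would then be compared against a t distribution with k - 1 degrees of freedom at the chosen significance level.
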

Statistically significant superiorities between algorithms are shown in Table 4. For instance, in 
Experiment 1, Naive Bayes performs better than IBk and ZeroR in terms of the Accuracy and Kappa metrics. 
The best performing algorithms were determined by comparing the total number of statistically 
significant superiorities, shown as “Number of Wins” in Table 4. As a result, REPTree, OneR 
and SVM with SMO have the highest “Number of Wins” values (17, 17 and 16, respectively) and are determined 
to be the top three algorithms on the CFP dataset of this study. As far as we know, this is the first study 
that uses data mining on requirements and CFP measurement data. Therefore, we could not compare the 
performance of our study directly with other similar research. However, if we benchmark against 
educational data mining studies in general, our top performing models compare very well in terms of 
accuracy [30-33]. 


Table 3. Algorithm Performances for CFP Dataset 


                                       Experiment 1: Cross-validation            Experiment 2: 66% Split Test 
Algorithm                              Accuracy (%)   FM           Kappa         Accuracy (%)  FM           Kappa 

Random Forest (RF)                     78.79 (11.72)  0.82 (0.11)  0.56 (0.25)   79.20 (5.54)  0.83 (0.04)  0.56 (0.12) 
Naive Bayes (NB)                       82.97 (11.13)  0.86 (0.09)  0.63 (0.24)   81.29 (5.46)  0.85 (0.04)  0.60 (0.12) 
REPTree                                84.02 (11.25)  0.86 (0.11)  0.67 (0.23)   84.27 (4.81)  0.86 (0.04)  0.68 (0.10) 
J48 (Weka C4.5 Implementation)         82.59 (11.39)  0.84 (0.11)  0.65 (0.23)   83.38 (4.58)  0.85 (0.04)  0.66 (0.09) 
Logistic Model Trees (LMT)             83.84 (11.34)  0.86 (0.11)  0.67 (0.23)   83.52 (5.21)  0.86 (0.05)  0.66 (0.11) 
Multilayer Perceptron (MLP)            76.74 (12.50)  0.80 (0.12)  0.52 (0.26)   75.94 (6.55)  0.80 (0.05)  0.49 (0.15) 
Support Vector Machines with SMO       84.16 (11.25)  0.86 (0.11)  0.68 (0.23)   83.70 (4.91)  0.86 (0.04)  0.67 (0.10) 
K-nearest Neighbour Classifier (IBk)   75.61 (11.56)  0.80 (0.10)  0.48 (0.25)   75.38 (5.16)  0.81 (0.04)  0.47 (0.12) 
OneR                                   84.16 (11.25)  0.86 (0.11)  0.68 (0.23)   84.27 (4.81)  0.86 (0.04)  0.68 (0.10) 
ZeroR                                  59.45 (1.64)   0.75 (0.01)  0.00 (0.00)   59.37 (0.63)  0.75 (0.04)  0.00 (0.00) 


Table 4. Statistically Significant Superiorities of Algorithms for CFP Dataset 


Algorithm Experiment 1: Cross-validation Experiment 2: 66% Split Number 
Accuracy FM Kappa Accuracy FM Kappa of Wins 
Random Forest (RF) ZeroR ZeroR ZeroR ZeroR ZeroR ZeroR 6 
MLP, IBk, IBk, 
Naive Bayes (NB) IBk, ZeroR ZeroR ZeroR ZeroR ZeroR ZeroR 10 
RF, 
REPTree RF, MLP, MLP, IBk, MLP, IBk, IBk, IBk, 17 
IBk, ZeroR ZeroR IBk, ZeroR ZeroR ZeroR 
ZeroR 
J48 (Weka C4.5 IBk, IBk, IBk, 
Implementation) IBk, ZeroR ZeroR ZeroR ZeroR ZeroR ZeroR 10 
MLP, 
MLP, IBk, MLP, IBk, IBk, IBk, IBk, 
Logistic Model Trees (LMT) ZeroR ZeroR ZeroR ZeroR ZeroR ZeroR 14 
Multilayer Perceptron (MLP) ZeroR ZeroR ZeroR ZeroR ZeroR ZeroR 6 
RF, 
Support Vector Machines with RF, MLP, MLP, IBk, MLP, IBk, ZeroR IBk, 16 
SMO IBk, ZeroR ZeroR IBk, ZeroR ZeroR 
ZeroR 
K-nearest Neighbour Classifier (IBk) ZeroR ZeroR ZeroR ZeroR ZeroR ZeroR 6 
RF, 
OneR RF, MLP, MLP, IBk, MLP, IBk, IBk, IBk, 17 
IBk, ZeroR ZeroR IBk, ZeroR ZeroR ZeroR 
ZeroR 
ZeroR None None None None None None 0 


4. CONCLUSION 
In this study, we conducted educational data mining research. Within the scope of this use case, a CFP 
dataset collected from a large telecommunications services and technology company was 
analysed using 10 machine learning algorithms to identify the CFP learning needs of requirements engineers. 
After two experiments, model performances were evaluated and the top performing algorithms were identified. 
REPTree, OneR and SVM with SMO performed better than the other algorithms in a statistically 
significant manner. The prediction performance of the top performing models is sufficient for use in the 
production environment of the company. In the future, the following research is planned: 
• The dominant indicators in CFP measurement will be identified by using feature selection algorithms. 
Some new indicators from the requirements artifacts may arise in this process. 
• The number of data points will be increased and the study will be replicated with additional 
algorithms such as Adaptive Neuro Fuzzy Inference System (ANFIS). 